{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deep Q-Network Variant Hindsight Replay DQN (HER-DQN) implementation (PyTorch).\n",
    "\n",
    "In this notebook, we will implement HER-DQN variant of DQN. We saw DQN in `6.a_dqn_pytorch.ipynb`. We will borrow some parts of it. However, in this notebook we will create a different environment which has no reward till you hit the target - an example of sparse reward setup. We will see how the approach of HER helps us make it efficient to learn in the sparse reward setup. \n",
    "\n",
    "### RECAP\n",
    "\n",
    "Q Learning control is carried out by sampling step by step and updating Q values at each step. We use $\\epsilon$-greedy policy to explore and generate samples. However, the policy learnt is a deterministic greedy policy with no exploration. We can carryout updates online i.e. we take a step and use `(current state, action, reward and next_state)` tuple to update.\n",
    "\n",
    "In case of function approximation using neural network, the input to the network is the state and output is the $q(s,a)$ for all the actions in the state $s$. It is denoted as $ \\hat{q}(s_t, a_t; w_{t}) $, where $w_{t}$ is the weigths of the neural network that we learn as part of DQN learning.\n",
    "\n",
    "We use two networks, one target network with weight $w^-_t$ to get the max $q$-value of next state with best action denoted by $ \\max\\limits_a \\hat {q}(S_{t+1},a; w^{-}_{t}) $ and network with weights $w_t^-$ which we periodically updated from primary network $w_t$.\n",
    "\n",
    "The Update equation is given below. This is the online version:\n",
    "$$ w_{t+1} \\leftarrow w_t + \\alpha [ R_{t+1} + \\gamma . \\max_{a} \\hat{q}(S_{t+1},a;w^{-}_{t}) – \\hat{q}(S_t,A_t;w_t)] \\nabla_{w_t} \\hat{q}(S_t,A_t;w_t)$$\n",
    "\n",
    "Online update with neural network with millions of weights does not work well. Accordingly, We use experience replay (aka Replay Buffer).  We use a behavior policy to explore the environment and store the samples `(s, a, r, s', done)` in a buffer. The samples are generated using an exploratory behavior policy while we improve a deterministic target policy using q-values.\n",
    "\n",
    "Therefore, we can always use older samples from behavior policy and apply them again and again. We can keep the buffer size fixed to some pre-determined size and keep deleting the older samples as we collect new ones. This process makes learning sample efficient by reusing a sample multiple times and also removing temporal dependence of the samples we would otherwise see while following a trajectory.\n",
    "\n",
    "The update equation with batch update with minor modifications is given below. We collect samples of transitions `(current state, action, reward, next state)` in a buffer, where each sample is denoted as a tuple:\n",
    "\n",
    "$$ (s_{i}, a_{i}, r_{i}, s'_{i}, done_{i})$$\n",
    "\n",
    "Subscript ($i$) denotes ith sample. We take $N$ samples from experience replay selecting randomly and update the weights. Subscript ($t$) denotes the index of weight updates. If the current state is done, as denoted by `done` flag, the target is just the reward as terminal states have zero value. The final update equation is as given below:\n",
    "\n",
    "$$w_{t+1} \\leftarrow w_t + \\alpha \\frac{1}{N} \\sum_{i=1}^{N} \\left[ r_i + \\left( (1-done_i) . \\gamma .  \\max_{a'} \\hat{q}(s'_{i},a';w^{-}_{t}) \\right) – \\hat{q}(s_i,a_i;w_t) \\right] \\nabla_{w_t} \\hat{q}(s_i,a_i;w_t)$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Running in Colab/Kaggle\n",
    "\n",
    "If you are running this on Colab, please uncomment below cells and run this to install required dependencies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Uncomment and execute this cell to install all the the dependencies if running in Google Colab\n",
    "\n",
    "# !apt-get update && apt-get install swig cmake ffmpeg freeglut3-dev xvfb\n",
    "# !pip install box2d-py\n",
    "# !pip install \"stable-baselines3[extra]>=2.1\"\n",
    "# !pip install \"huggingface_sb3>=3.0\"\n",
    "\n",
    "# !pip install git+https://github.com/DLR-RM/rl-baselines3-zoo@update/hf\n",
    "# !git clone https://github.com/DLR-RM/rl-baselines3-zoo\n",
    "# %cd rl-baselines3-zoo/\n",
    "# !pip install -r requirements.txt\n",
    "# %cd ..\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2023-12-16 16:41:57.066879: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
      "2023-12-16 16:41:57.070524: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.\n",
      "2023-12-16 16:41:57.123044: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
      "2023-12-16 16:41:57.123172: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
      "2023-12-16 16:41:57.123224: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
      "2023-12-16 16:41:57.136470: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.\n",
      "2023-12-16 16:41:57.138738: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
      "To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
      "2023-12-16 16:41:58.313911: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "import math\n",
    "import numpy as np\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.nn.functional as F\n",
    "import torch.optim as optim\n",
    "import gymnasium as gym\n",
    "import matplotlib.pyplot as plt\n",
    "from scipy.signal import convolve, gaussian\n",
    "from stable_baselines3.common.vec_env import VecVideoRecorder, DummyVecEnv\n",
    "from base64 import b64encode\n",
    "import time\n",
    "from tqdm import trange\n",
    "import glob\n",
    "from collections import namedtuple\n",
    "\n",
    "from IPython.display import HTML, clear_output\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "device(type='cpu')"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
    "device"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Experience replay\n",
    "\n",
    "We will use the replay buffer we saw in chapter 4 listings. Replay buffer is very important in DQN to break the correlation between samples. We use a behavior policy ($\\epsilon$-greedy) to sample from the environment and store the transitions `(s,a,r,s',done)` into a buffer. These samples are used multiple times in a learning making the process sample efficient.\n",
    "\n",
    "The interface to ReplayBuffer is:\n",
    "* `exp_replay.add(state, action, reward, next_state, done)` - saves (s,a,r,s',done) tuple into the buffer\n",
    "* `exp_replay.sample(batch_size)` - returns states, actions, rewards, next_states and done_flags for `batch_size` random samples.\n",
    "* `len(exp_replay)` - returns number of elements stored in replay buffer.\n",
    "\n",
    "We have modified the implementation a bit to make it more efficient"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "class ReplayBuffer:\n",
    "    def __init__(self, size):\n",
    "        self.size = size #max number of items in buffer\n",
    "        self.buffer =[] #array to holde buffer\n",
    "        self.next_id = 0\n",
    "    \n",
    "    def __len__(self):\n",
    "        return len(self.buffer)\n",
    "    \n",
    "    def add(self, state, action, reward, next_state, done):\n",
    "        item = (state, action, reward, next_state, done)\n",
    "        if len(self.buffer) < self.size:\n",
    "           self.buffer.append(item)\n",
    "        else:\n",
    "            self.buffer[self.next_id] = item\n",
    "        self.next_id = (self.next_id + 1) % self.size\n",
    "        \n",
    "    def sample(self, batch_size):\n",
    "        idxs = np.random.choice(len(self.buffer), batch_size)\n",
    "        samples = [self.buffer[i] for i in idxs]\n",
    "        states, actions, rewards, next_states, done_flags = list(zip(*samples))\n",
    "        return np.array(states), np.array(actions), np.array(rewards), np.array(next_states), np.array(done_flags)"
   ]
  },
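  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ring-buffer behaviour of `add` is easiest to see on a tiny example. The sketch below assumes the `ReplayBuffer` class from the cell above has already been run; the name `demo_buffer` and the integer stand-in states are made up for illustration only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative check (assumes ReplayBuffer from the cell above); integers stand in for states\n",
    "demo_buffer = ReplayBuffer(size=3)\n",
    "for i in range(5):\n",
    "    demo_buffer.add(state=i, action=0, reward=-1.0, next_state=i + 1, done=False)\n",
    "\n",
    "# capacity is 3, so the two oldest entries (states 0 and 1) have been overwritten\n",
    "print(len(demo_buffer))                                # 3\n",
    "print(sorted(item[0] for item in demo_buffer.buffer))  # [2, 3, 4]\n",
    "\n",
    "# sampling draws indices uniformly with replacement and returns five parallel numpy arrays\n",
    "states, actions, rewards, next_states, done_flags = demo_buffer.sample(batch_size=2)\n",
    "print(states.shape, done_flags.dtype)                  # (2,) bool"
   ]
  },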
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Learning with DQN\n",
    "Here we write a function similar to tabular q-learning. We will calculate average TD error per batch using the equation:\n",
    "\n",
    "$$ L =  \\frac{1}{N} \\sum_{i=1}^{N} \\left[ r_i + \\left( (1-done_i) . \\gamma .  \\max_{a'} \\hat{q}(s'_i,a';w^-_t) \\right) – \\hat{q}_{w_t}(s_i,a_i;w_t) \\right]^2$$\n",
    "\n",
    "\n",
    "$$ \\nabla_{w_t} L =   - \\frac{1}{N} \\sum_{i=1}^{N} \\left[ r_i + \\left( (1-done_i) . \\gamma .  \\max_{a'} \\hat{q}(s'_i,a';w^-_t) \\right) – \\hat{q}(s_i,a_i;w_t) \\right] \\nabla \\hat{q}_{w_t}(s_i,a_i;w_t)$$\n",
    "\n",
    "\n",
    "$\\hat{q}(s',a';w^{-})$ is calculated using target network whose weights are held constant and refreshed periodically from the agent learning network.\n",
    "\n",
    "Target is given by following:\n",
    "* non terminal state: $r_i +  \\gamma .  \\max\\limits_{a'} \\hat{q}(s'_i,a';w^-_t)$\n",
    "* terminal state: $ r_i $\n",
    "\n",
    "We then carryout back propagation through the agent network to update the weights using equation below:\n",
    "\n",
    "\n",
    "$$\n",
    "w_{t+1} \\leftarrow w_t - \\alpha . \\nabla_{w_t}L$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def td_loss_dqn(agent, target_network, states, actions, rewards, next_states, done_flags,\n",
    "                    gamma=0.99, device=device):\n",
    "\n",
    "    # convert numpy array to torch tensors\n",
    "    states = torch.tensor(states, device=device, dtype=torch.float)\n",
    "    actions = torch.tensor(actions, device=device, dtype=torch.long)\n",
    "    rewards = torch.tensor(rewards, device=device, dtype=torch.float)\n",
    "    next_states = torch.tensor(next_states, device=device, dtype=torch.float)\n",
    "    done_flags = torch.tensor(done_flags.astype('float32'),device=device,dtype=torch.float)\n",
    "\n",
    "    # get q-values for all actions in current states\n",
    "    # use agent network\n",
    "    predicted_qvalues = agent(states)\n",
    "\n",
    "    # compute q-values for all actions in next states\n",
    "    # use target network\n",
    "    predicted_next_qvalues = target_network(next_states)\n",
    "\n",
    "    # select q-values for chosen actions\n",
    "    predicted_qvalues_for_actions = predicted_qvalues[range(\n",
    "        len(actions)), actions]\n",
    "\n",
    "    # compute Qmax(next_states, actions) using predicted next q-values\n",
    "    next_state_values,_ = torch.max(predicted_next_qvalues, dim=1)\n",
    "\n",
    "    # compute \"target q-values\"\n",
    "    target_qvalues_for_actions = rewards + gamma * next_state_values * (1-done_flags)\n",
    "\n",
    "    # mean squared error loss to minimize\n",
    "    loss = torch.mean((predicted_qvalues_for_actions -\n",
    "                       target_qvalues_for_actions.detach()) ** 2)\n",
    "\n",
    "    return loss"
   ]
  },
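  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check on the target and loss computed above, here is a small worked example. The numbers are made up purely for illustration (they do not come from a training run): a batch of $N=2$ transitions with $\\gamma = 0.99$.\n",
    "\n",
    "* Transition 1 (non-terminal): $r_1=-1$, $done_1=0$, $\\max_{a'} \\hat{q}(s'_1,a';w^-) = -3$, prediction $\\hat{q}(s_1,a_1;w) = -4$.\n",
    "* Transition 2 (terminal): $r_2=0$, $done_2=1$, prediction $\\hat{q}(s_2,a_2;w) = -0.5$.\n",
    "\n",
    "$$ y_1 = -1 + (1-0) \\times 0.99 \\times (-3) = -3.97, \\qquad y_2 = 0 + (1-1) \\times 0.99 \\times \\max_{a'} \\hat{q}(s'_2,a';w^-) = 0 $$\n",
    "\n",
    "$$ L = \\frac{1}{2} \\left[ (-3.97 - (-4))^2 + (0 - (-0.5))^2 \\right] = \\frac{1}{2} \\left[ 0.0009 + 0.25 \\right] = 0.12545 $$\n",
    "\n",
    "For the terminal transition the factor $(1-done_2)$ zeroes out the bootstrap term, so whatever the target network predicts for $s'_2$ has no effect on the target."
   ]
  },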
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Hindsight Replay\n",
    "\n",
    "In the paper by OpenAI in 2018, https://arxiv.org/pdf/1707.01495.pdf, the authors presented a sample efficient approach to learn in the environment where the rewards are sparse and binary. The common approach is to shape the reward function in a way to guide the agents towards optimization. This is not generalizable. It requires a deep understanding of the domain to design a suitable reward function.\n",
    "\n",
    "Compared to RL agents, which learn from a successful outcome, humans seem to learn not just from that but also from unsuccessful outcomes. This is the basis of the idea proposed in Hindsight replay approach known as **HER**. While **HER** can be combined with various RL approaches. In the code cells below we will use HER with Dueling DQN giving us **HER-DQN**\n",
    "\n",
    "In HER approach, after an episode is played out which let us say was not successful, we form a secondary objective where the original goal is replaced with the last state before termination as a goal for this trajectory since this trajectory ended in that state. \n",
    "\n",
    "Say a episode has been played out $ s_0, s_1, .... s_T$. Normally we store in Replay buffer a tuple of $(s_t, a_t, r, s_{t+1}, done)$. Let us say the goal for this episode was $g$ which could not be achieved in this run. In HER approach we will store following to the replay buffer:\n",
    "\n",
    "* $(s_t||g, a_t, r, s_{t+1}||g, done)$\n",
    "* $(s_t||g', a_t, r(s_t, a_t, g'), s_{t+1}||g', done)$: other state transitions based on synthetic goals like last state of the episode as a sub-goal say g'. The reward is modified to show how state transition $s_t \\rightarrow s_{t+1}$, was good or bad for the sub-goal of $g'$. \n",
    "\n",
    "Original paper discusses various strategies for forming these subgoals. We will use one of them called `future`:\n",
    "* future — replay with k random states which come from the same episode as the transition being replayed and were observed after it.\n",
    "\n",
    "We also use a different kind of environment from our past notebooks. We will use an environment as used in the paper that of bit-flipping experiment. Say you have a vector with n-bits, each being binary in the range {0,1}. Therefore there are $2^n$ combinations possible. At reset, environment starts in a n-bit configuration randomly and the goal is also randomly picked to be some different n-bit configuration. Each action is to flip a bit. The bit to be flipped is the policy $\\pi(a|s)$ that agent is trying to learn. An episode ends if the agent is able to find the right configuration matching the goal or when agent has exhausted `n` actions in an episode.\n",
    "\n",
    "The authors show that a regular DQN, where the state (configuration of n-bits) is represented as a deep network, it is almost impossible for a regular DQN agent to learn beyond 15 digit combinations. However, coupled with HER-DQN approach, the agent is able to learn easily even for large digit combinations like 50 or so. \n"
   ]
  },
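  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The relabelling described above can be sketched on a tiny hand-made episode. This is a minimal illustration under stated assumptions, not the training code used later: the helper `sparse_reward`, the `demo_` names, the 3-bit trajectory and the value `demo_k = 2` are all invented for this example. The future index is drawn from the current step onward, so the state reached by the transition itself may be picked as the new goal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "import numpy as np\n",
    "\n",
    "def sparse_reward(next_state, goal):\n",
    "    # hypothetical helper: reward 0 when the goal is reached, -1 otherwise\n",
    "    done = bool(np.array_equal(next_state, goal))\n",
    "    return (0.0 if done else -1.0), done\n",
    "\n",
    "# hand-made 3-bit episode: flip bit 0, then bit 2; the original goal [1, 1, 1] is never reached\n",
    "demo_goal = np.array([1., 1., 1.])\n",
    "demo_trajectory = [  # (state, action, next_state)\n",
    "    (np.array([0., 0., 0.]), 0, np.array([1., 0., 0.])),\n",
    "    (np.array([1., 0., 0.]), 2, np.array([1., 0., 1.])),\n",
    "]\n",
    "\n",
    "demo_k = 2  # relabelled goals per transition (assumed value for this demo)\n",
    "demo_replay = []\n",
    "for t, (s, a, s_next) in enumerate(demo_trajectory):\n",
    "    # original transition, with the original goal appended to both states\n",
    "    r, done = sparse_reward(s_next, demo_goal)\n",
    "    demo_replay.append((np.concatenate((s, demo_goal)), a, r, np.concatenate((s_next, demo_goal)), done))\n",
    "    # hindsight transitions: substitute a state reached at step t or later as the goal\n",
    "    for _ in range(demo_k):\n",
    "        future = random.randint(t, len(demo_trajectory) - 1)  # bounds are inclusive\n",
    "        g_new = demo_trajectory[future][2]\n",
    "        r_new, done_new = sparse_reward(s_next, g_new)\n",
    "        demo_replay.append((np.concatenate((s, g_new)), a, r_new, np.concatenate((s_next, g_new)), done_new))\n",
    "\n",
    "print(len(demo_replay))  # 2 transitions * (1 + demo_k) = 6 stored entries\n",
    "# the original goal yields only -1 rewards, while relabelled entries include zero-reward successes\n",
    "print(sum(item[2] == 0.0 for item in demo_replay))  # at least 2 (both relabels of the last step)"
   ]
  },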
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BitFlipEnvironment:\n",
    "\n",
    "    def __init__(self, bits):\n",
    "        self.bits = bits\n",
    "        self.state = np.zeros((self.bits, ))\n",
    "        self.goal = np.zeros((self.bits, ))\n",
    "        self.reset()\n",
    "\n",
    "    def reset(self):\n",
    "        self.state = np.random.randint(2, size=self.bits).astype(np.float32)\n",
    "        self.goal = np.random.randint(2, size=self.bits).astype(np.float32)\n",
    "        if np.allclose(self.state, self.goal):\n",
    "            self.reset()\n",
    "        return self.state.copy(), self.goal.copy()\n",
    "\n",
    "    def step(self, action):\n",
    "        self.state[action] = 1 - self.state[action]  # Flip the bit on position of the action\n",
    "        reward, done = self.compute_reward(self.state, self.goal)\n",
    "        return self.state.copy(), reward, done\n",
    "\n",
    "    def render(self):\n",
    "        print(\"State: {}\".format(self.state.tolist()))\n",
    "        print(\"Goal : {}\\n\".format(self.goal.tolist()))\n",
    "\n",
    "    @staticmethod\n",
    "    def compute_reward(state, goal):\n",
    "        done = np.allclose(state, goal)\n",
    "        return 0.0 if done else -1.0, done\n",
    "\n",
    "# a simplified version of DuelingDQN with lesser number of layers\n",
    "class DuelingMLP(nn.Module):\n",
    "\n",
    "    def __init__(self, state_size, n_actions, epsilon=1.0):\n",
    "        super().__init__()\n",
    "        self.state_size = state_size\n",
    "        self.n_actions = n_actions\n",
    "        self.epsilon = epsilon\n",
    "        self.linear = nn.Linear(state_size, 256)\n",
    "        self.v = nn.Linear(256, 1)\n",
    "        self.adv = nn.Linear(256, n_actions)\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = F.relu(self.linear(x))\n",
    "        v = self.v(x)\n",
    "        adv = self.adv(x)\n",
    "        qvalues = v + adv - adv.mean(dim=1, keepdim=True)\n",
    "        return qvalues\n",
    "    \n",
    "    def get_qvalues(self, states):\n",
    "        # input is an array of states in numpy and output is Qvals as numpy array\n",
    "        states = torch.tensor(np.array(states), device=device, dtype=torch.float32)\n",
    "        qvalues = self.forward(states)\n",
    "        return qvalues.data.cpu().numpy()\n",
    "\n",
    "    def sample_actions(self, qvalues):\n",
    "        # sample actions from a batch of q_values using epsilon greedy policy\n",
    "        epsilon = self.epsilon\n",
    "        batch_size, n_actions = qvalues.shape        \n",
    "        random_actions = np.random.choice(n_actions, size=batch_size)\n",
    "        best_actions = qvalues.argmax(-1)\n",
    "        should_explore = np.random.choice(\n",
    "            [0, 1], batch_size, p=[1-epsilon, epsilon])\n",
    "        return np.where(should_explore, random_actions, best_actions)\n",
    "\n"
   ]
  },
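  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short interaction loop shows the sparse reward in action. This is a sketch that assumes the `BitFlipEnvironment` class from the cell above has already been run; `demo_env` and the 4-bit size are arbitrary choices for illustration. It flips exactly the bits that differ from the goal, so every step returns reward `-1.0` except the final one, which returns `0.0` with `done=True`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative walk-through (assumes BitFlipEnvironment from the cell above)\n",
    "demo_env = BitFlipEnvironment(bits=4)\n",
    "demo_state, demo_goal = demo_env.reset()\n",
    "demo_env.render()\n",
    "\n",
    "# indices of the bits that differ from the goal, computed once before stepping\n",
    "bits_to_flip = np.flatnonzero(demo_state != demo_goal)\n",
    "for bit in bits_to_flip:\n",
    "    demo_state, reward, done = demo_env.step(bit)\n",
    "    print('flipped bit {}: reward={}, done={}'.format(bit, reward, done))"
   ]
  },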
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def train_her(env, agent, target_network, optimizer, td_loss_fn):\n",
    "\n",
    "    success_rate = 0.0\n",
    "    success_rates = []\n",
    "    \n",
    "    exp_replay = ReplayBuffer(10**6)\n",
    "    \n",
    "    for epoch in range(num_epochs):\n",
    "\n",
    "        # Decay epsilon linearly from eps_max to eps_min\n",
    "        eps = max(eps_max - epoch * (eps_max - eps_min) / int(num_epochs * exploration_fraction), eps_min)\n",
    "        print(\"Epoch: {}, exploration: {:.0f}%, success rate: {:.2f}\".format(epoch + 1, 100 * eps, success_rate))\n",
    "        agent.epsilon = eps\n",
    "        target_network.epsilon = eps\n",
    "\n",
    "        successes = 0\n",
    "        for cycle in range(num_cycles):\n",
    "\n",
    "            for episode in range(num_episodes):\n",
    "\n",
    "                # Run episode and cache trajectory\n",
    "                episode_trajectory = []\n",
    "                state, goal = env.reset()\n",
    "\n",
    "                for step in range(num_bits):\n",
    "\n",
    "                    state_ = np.concatenate((state, goal))\n",
    "                    qvalues = agent.get_qvalues([state_])\n",
    "                    action = agent.sample_actions(qvalues)[0]\n",
    "                    next_state, reward, done = env.step(action)\n",
    "                    \n",
    "                    episode_trajectory.append((state, action, reward, next_state, done))\n",
    "                    state = next_state\n",
    "                    if done:\n",
    "                        successes += 1\n",
    "                        break\n",
    "\n",
    "                # Fill up replay memory\n",
    "                steps_taken = step\n",
    "                for t in range(steps_taken):\n",
    "\n",
    "                    # Usual experience replay\n",
    "                    state, action, reward, next_state, done = episode_trajectory[t]\n",
    "                    state_, next_state_ = np.concatenate((state, goal)), np.concatenate((next_state, goal))\n",
    "                    exp_replay.add(state_, action, reward, next_state_, done)\n",
    "\n",
    "                    # Hindsight experience replay\n",
    "                    for _ in range(future_k):\n",
    "                        future = random.randint(t, steps_taken)  # index of future time step\n",
    "                        new_goal = episode_trajectory[future][3]  # take future next_state from (s,a,r,s',d) and set as goal\n",
    "                        new_reward, new_done = env.compute_reward(next_state, new_goal)\n",
    "                        state_, next_state_ = np.concatenate((state, new_goal)), np.concatenate((next_state, new_goal))\n",
    "                        exp_replay.add(state_, action, new_reward, next_state_, new_done)\n",
    "\n",
    "            # Optimize DQN\n",
    "            for opt_step in range(num_opt_steps):\n",
    "                # train by sampling batch_size of data from experience replay\n",
    "                states, actions, rewards, next_states, done_flags = exp_replay.sample(batch_size)\n",
    "                # loss = <compute TD loss>\n",
    "                optimizer.zero_grad()\n",
    "                loss = td_loss_fn(agent, target_network, \n",
    "                                  states, actions, rewards, next_states, done_flags,                  \n",
    "                                  gamma=0.99,\n",
    "                                  device=device)\n",
    "                loss.backward()\n",
    "                optimizer.step()\n",
    "        \n",
    "            target_network.load_state_dict(agent.state_dict())\n",
    "\n",
    "        success_rate = successes / (num_episodes * num_cycles)\n",
    "        success_rates.append(success_rate)\n",
    "\n",
    "    # print graph\n",
    "    plt.plot(success_rates, label=\"HER-DQN\")\n",
    "\n",
    "    plt.legend()\n",
    "    plt.xlabel(\"Epoch\")\n",
    "    plt.ylabel(\"Success rate\")\n",
    "    plt.title(\"Number of bits: {}\".format(num_bits))\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_bits = 50 \n",
    "n_actions = num_bits\n",
    "state_size = 2*num_bits\n",
    "\n",
    "future_k = 4\n",
    "num_epochs = 40\n",
    "num_cycles = 50\n",
    "num_episodes = 16\n",
    "num_opt_steps = 40\n",
    "eps_max=0.2\n",
    "eps_min=0.0\n",
    "exploration_fraction=0.5\n",
    "batch_size = 128\n",
    "\n",
    "\n",
    "env = BitFlipEnvironment(num_bits)\n",
    "\n",
    "agent = DuelingMLP(state_size, n_actions, epsilon=1).to(device)\n",
    "target_network = DuelingMLP(state_size, n_actions, epsilon=1).to(device)\n",
    "target_network.load_state_dict(agent.state_dict())\n",
    "optimizer = torch.optim.Adam(agent.parameters(), lr=1e-3)"
   ]
  },
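  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make these settings concrete, here is the arithmetic they imply, worked out from the formulas in `train_her`. With `eps_max = 0.2`, `eps_min = 0.0`, `num_epochs = 40` and `exploration_fraction = 0.5`, the linear decay spans `int(40 * 0.5) = 20` epochs:\n",
    "\n",
    "$$ \\epsilon(e) = \\max\\left(0.2 - e \\times \\frac{0.2 - 0.0}{20},\\ 0.0\\right) = \\max(0.2 - 0.01\\,e,\\ 0) $$\n",
    "\n",
    "where $e$ is the zero-based epoch index, so $\\epsilon(0)=0.2$, $\\epsilon(10)=0.1$ and $\\epsilon(e)=0$ for $e \\ge 20$.\n",
    "\n",
    "The same settings also fix the amount of data collection and optimization work:\n",
    "\n",
    "$$ \\text{episodes} = 40 \\times 50 \\times 16 = 32{,}000, \\qquad \\text{gradient steps} = 40 \\times 50 \\times 40 = 80{,}000 $$\n",
    "\n",
    "Each stored step contributes $1 + k = 5$ entries to the buffer (with `future_k = 4`), and an episode has at most 50 steps, so at most $32{,}000 \\times 50 \\times 5 = 8{,}000{,}000$ entries can be generated. This upper bound exceeds the buffer capacity of $10^6$ set in `train_her`, so over a full run the oldest entries are likely to be overwritten."
   ]
  },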
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 1, exploration: 20%, success rate: 0.00\n",
      "Epoch: 2, exploration: 19%, success rate: 0.00\n",
      "Epoch: 3, exploration: 18%, success rate: 0.00\n",
      "Epoch: 4, exploration: 17%, success rate: 0.00\n",
      "Epoch: 5, exploration: 16%, success rate: 0.00\n",
      "Epoch: 6, exploration: 15%, success rate: 0.00\n",
      "Epoch: 7, exploration: 14%, success rate: 0.00\n",
      "Epoch: 8, exploration: 13%, success rate: 0.01\n",
      "Epoch: 9, exploration: 12%, success rate: 0.03\n",
      "Epoch: 10, exploration: 11%, success rate: 0.04\n",
      "Epoch: 11, exploration: 10%, success rate: 0.09\n",
      "Epoch: 12, exploration: 9%, success rate: 0.15\n",
      "Epoch: 13, exploration: 8%, success rate: 0.21\n",
      "Epoch: 14, exploration: 7%, success rate: 0.29\n",
      "Epoch: 15, exploration: 6%, success rate: 0.43\n",
      "Epoch: 16, exploration: 5%, success rate: 0.52\n",
      "Epoch: 17, exploration: 4%, success rate: 0.63\n",
      "Epoch: 18, exploration: 3%, success rate: 0.75\n",
      "Epoch: 19, exploration: 2%, success rate: 0.81\n",
      "Epoch: 20, exploration: 1%, success rate: 0.89\n",
      "Epoch: 21, exploration: 0%, success rate: 0.92\n",
      "Epoch: 22, exploration: 0%, success rate: 0.95\n",
      "Epoch: 23, exploration: 0%, success rate: 0.97\n",
      "Epoch: 24, exploration: 0%, success rate: 0.98\n",
      "Epoch: 25, exploration: 0%, success rate: 1.00\n",
      "Epoch: 26, exploration: 0%, success rate: 0.99\n",
      "Epoch: 27, exploration: 0%, success rate: 1.00\n",
      "Epoch: 28, exploration: 0%, success rate: 1.00\n",
      "Epoch: 29, exploration: 0%, success rate: 1.00\n",
      "Epoch: 30, exploration: 0%, success rate: 1.00\n",
      "Epoch: 31, exploration: 0%, success rate: 1.00\n",
      "Epoch: 32, exploration: 0%, success rate: 1.00\n",
      "Epoch: 33, exploration: 0%, success rate: 1.00\n",
      "Epoch: 34, exploration: 0%, success rate: 1.00\n",
      "Epoch: 35, exploration: 0%, success rate: 1.00\n",
      "Epoch: 36, exploration: 0%, success rate: 1.00\n",
      "Epoch: 37, exploration: 0%, success rate: 1.00\n",
      "Epoch: 38, exploration: 0%, success rate: 1.00\n",
      "Epoch: 39, exploration: 0%, success rate: 1.00\n",
      "Epoch: 40, exploration: 0%, success rate: 1.00\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAHHCAYAAABDUnkqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/SrBM8AAAACXBIWXMAAA9hAAAPYQGoP6dpAABSPUlEQVR4nO3dd1hT598G8DthhB1ANqKg4FZwgdRZxdk6WttatXW01TqrpX2ruK1VtNbRn1qt2tYuq9WqHW5xK07cCi4UF0uFsEdy3j+Q1JQhgcAhyf25rlySk3OS7+Eo3D7nGRJBEAQQERERGQip2AUQERER6RLDDRERERkUhhsiIiIyKAw3REREZFAYboiIiMigMNwQERGRQWG4ISIiIoPCcENEREQGheGGiIiIDArDDRGVy8GDByGRSLB582axSymThIQEvPHGG6hRowYkEgmWLl1a7H537tyBRCLBV1999cL3nDVrFiQSiY4rJaKKYrghqsbWrVsHiUQCCwsLPHjwoMjrnTp1QpMmTUSoTP98/PHH2L17N8LCwvDzzz+jR48elfI58+bNw7Zt2yrlvZ9XGKz++7CwsCh2/++++w4NGzaEhYUF/Pz8sGzZskqvkUgspmIXQEQvlpOTg/nz5/MXUgXs378fffv2xaeffqqz95w2bRomT56ssW3evHl444030K9fP519TmlWrlwJGxsb9XMTE5Mi+3z77bcYNWoU+vfvj9DQUBw5cgQfffQRMjMzMWnSpCqpk6gqMdwQ6YGAgACsWbMGYWFh8PDwELucKpWRkQFra+sKv09iYiLs7e0rXtBzTE1NYWoq7o/RN954A05OTiW+npWVhalTp+KVV15R30IcMWIEVCoV5syZg5EjR8LBwaGqyiWqErwtRaQHpkyZAqVSifnz55e6X2F/kXXr1hV5TSKRYNasWernhbc1rl+/jnfeeQdyuRzOzs6YPn06BEHAvXv30LdvX9jZ2cHNzQ2LFi0q9jOVSiWmTJkCNzc3WFtbo0+fPrh3716R/U6ePIkePXpALpfDysoKHTt2xLFjxzT2Kazp6tWrGDRoEBwcHNCuXbtSz/n27dt488034ejoCCsrK7Rp0wbbt29Xv154a08QBKxYsUJ9+6YslixZgtq1a8PS0hIdO3bE5cuXi623kEQiQUZGBn788Uf15wwbNgwAkJaWhokTJ8Lb2xsymQwuLi7o2rUroqKi1MdnZmYiOjoaycnJZaoPAARBgEKhgCAIxb5+4MABPH78GGPGjNHYPnbsWGRkZGh8r4gMBcMNkR7w8fHBkCFDsGbNGjx8+FCn7z1gwACoVCrMnz8fQUFB+OKLL7B06VJ07doVnp6eWLBgAXx9ffHpp5/i8OHDRY6fO3cutm/fjkmTJuGjjz7C3r17ERISgqysLPU++/fvR4cOHaBQKDBz5kzMmzcPKSkp6Ny5M06dOlXkPd98801kZmZi3rx5GDFiRIm1JyQk4KWXXsLu3bsxZswYzJ07F9nZ2ejTpw+2bt0KAOjQoQN+/vlnAEDXrl3x888/q5+X5qeffsL//vc/jB07FmFhYbh8+TI6d+6MhISEEo/5+eefIZPJ0L59e/XnfPjhhwCAUaNGYeXKlejfvz+++eYbfPrpp7C0tMS1a9fUx586dQoNGzbE8uXLX1hfoTp16kAul8PW1hbvvPNOkfrOnTsHAGjVqpXG9pYtW0IqlapfJzIoAhFVWz/88IMAQDh9+rRw69YtwdTUVPjoo4/Ur3fs2FFo3Lix+nlsbKwAQPjhhx+KvBcAYebMmernM2fOFAAII0eOVG/Lz88XatasKUgkEmH+/Pnq7U+fPhUsLS2FoUOHqrcdOHBAACB4enoKCoVCvf33338XAAhff/21IAiCoFKpBD8/P6F79+6CSqVS75eZmSn4+PgIXbt2LVLTwIEDy/T9mThxogBAOHLkiHpbWlqa4OPjI3h7ewtKpVLj/MeOHfvC9yz8
HlpaWgr3799Xbz958qQAQPj444+L1Ps8a2trje9TIblc/sLPL/yePn+dSrJ06VJh3Lhxwq+//ips3rxZmDBhgmBqair4+fkJqamp6v3Gjh0rmJiYFPsezs7Owttvv/3CzyLSN2y5IdITderUwbvvvovVq1fj0aNHOnvfDz74QP21iYkJWrVqBUEQ8P7776u329vbo379+rh9+3aR44cMGQJbW1v18zfeeAPu7u7YsWMHAOD8+fO4ceMGBg0ahMePHyM5ORnJycnIyMhAly5dcPjwYahUKo33HDVqVJlq37FjBwIDAzVuXdnY2GDkyJG4c+cOrl69WrZvQjH69esHT09P9fPAwEAEBQWpz0tb9vb2OHnyZKktb506dYIgCBq3D0syYcIELFu2DIMGDUL//v2xdOlS/Pjjj7hx4wa++eYb9X5ZWVkwNzcv9j0sLCw0WtiIDAXDDZEemTZtGvLz81/Y90YbtWrV0ngul8thYWFRpJOqXC7H06dPixzv5+en8VwikcDX1xd37twBANy4cQMAMHToUDg7O2s81q5di5ycHKSmpmq8h4+PT5lqv3v3LurXr19ke8OGDdWvl9d/zwsA6tWrpz4vbX355Ze4fPkyvLy8EBgYiFmzZhUbFiti0KBBcHNzw759+9TbLC0tkZubW+z+2dnZsLS01GkNRNUBR0sR6ZE6dergnXfewerVq4sMQQZQYkdZpVJZ4nsWN3S4uG0ASuy0WprCVpmFCxciICCg2H2eH8oMwCB/4b711lto3749tm7dij179mDhwoVYsGABtmzZgp49e+rsc7y8vPDkyRP1c3d3dyiVSiQmJsLFxUW9PTc3F48fPza60XdkHBhuiPTMtGnT8Msvv2DBggVFXisc0puSkqKxvSItGC9S2DJTSBAE3Lx5E82aNQMA1K1bFwBgZ2eHkJAQnX527dq1ERMTU2R7dHS0+vXy+u95AcD169fh7e1d6nGljcRyd3fHmDFjMGbMGCQmJqJFixaYO3euzsKNIAi4c+cOmjdvrt5WGCjPnDmDXr16qbefOXMGKpWqxMBJpM94W4pIz9StWxfvvPMOvv32W8THx2u8ZmdnBycnpyKjmp7vg6FrP/30E9LS0tTPN2/ejEePHql/Ybds2RJ169bFV199hfT09CLHJyUllfuze/XqhVOnTiEyMlK9LSMjA6tXr4a3tzcaNWpU7vfetm2bxqzQp06dwsmTJ18YRKytrYuES6VSWeTWm4uLCzw8PJCTk6Peps1Q8OK+bytXrkRSUpLG7MudO3eGo6MjVq5cWWRfKysrvPLKKy/8LCJ9w5YbIj00depU/Pzzz4iJiUHjxo01Xvvggw8wf/58fPDBB2jVqhUOHz6M69evV1otjo6OaNeuHYYPH46EhAQsXboUvr6+6iHcUqkUa9euRc+ePdG4cWMMHz4cnp6eePDgAQ4cOAA7Ozv8/fff5frsyZMn47fffkPPnj3x0UcfwdHRET/++CNiY2Pxxx9/QCot///ffH190a5dO4wePRo5OTlYunQpatSogc8++6zU41q2bIl9+/Zh8eLF8PDwgI+PD+rXr4+aNWvijTfegL+/P2xsbLBv3z6cPn1aY/6gU6dO4eWXX8bMmTNf2Km4du3aGDBgAJo2bQoLCwscPXoUGzZsQEBAgHr4OVBwi2/OnDkYO3Ys3nzzTXTv3h1HjhzBL7/8grlz58LR0bHc3yOi6orhhkgP+fr64p133sGPP/5Y5LUZM2YgKSkJmzdvxu+//46ePXti586dGv0tdGnKlCm4ePEiwsPDkZaWhi5duuCbb76BlZWVep9OnTohMjISc+bMwfLly5Geng43NzcEBQVp/CLWlqurK44fP45JkyZh2bJlyM7ORrNmzfD3339XuEViyJAhkEqlWLp0KRITExEYGIjly5fD3d291OMWL16MkSNHYtq0acjKysLQoUOxevVqjBkzBnv27MGWLVugUqng6+uLb775BqNHjy5XfYMH
D8bx48fxxx9/IDs7G7Vr18Znn32GqVOnanzvAWDMmDEwMzPDokWL8Ndff8HLywtLlizBhAkTyvXZRNWdRChPD0EiIiKiaop9boiIiMigMNwQERGRQWG4ISIiIoPCcENEREQGheGGiIiIDArDDRERERkUo5vnRqVS4eHDh7C1tS11mnQiIiKqPgRBQFpaGjw8PF44QafRhZuHDx/Cy8tL7DKIiIioHO7du4eaNWuWuo/RhRtbW1sABd8cOzs7kashIiKislAoFPDy8lL/Hi+N0YWbwltRdnZ2DDdERER6pixdStihmIiIiAwKww0REREZFIYbIiIiMihG1+emrJRKJfLy8sQug8rJ3Nz8hUMFiYjIMDHc/IcgCIiPj0dKSorYpVAFSKVS+Pj4wNzcXOxSiIioijHc/EdhsHFxcYGVlRUn+tNDhRM1Pnr0CLVq1eI1JCIyMgw3z1EqlepgU6NGDbHLoQpwdnbGw4cPkZ+fDzMzM7HLISKiKsROCc8p7GNjZWUlciVUUYW3o5RKpciVEBFRVWO4KQZvY+g/XkMiIuPFcENEREQGRdRwc/jwYfTu3RseHh6QSCTYtm3bC485ePAgWrRoAZlMBl9fX6xbt67S6yQiIiL9IWq4ycjIgL+/P1asWFGm/WNjY/HKK6/g5Zdfxvnz5zFx4kR88MEH2L17dyVXWv0NGzYM/fr1K7L94MGDkEgkSElJUX9d3CM+Ph4AMGvWLPU2ExMTeHl5YeTIkXjy5Empn//8caampnByckKHDh2wdOlS5OTkFNn/ypUreOutt+Ds7AyZTIZ69ephxowZyMzM1NjP29sbEokEJ06c0Ng+ceJEdOrUSbtvEhERGQVRR0v17NkTPXv2LPP+q1atgo+PDxYtWgQAaNiwIY4ePYolS5age/fulVWmwYmJiSmyaKiLi4v668aNG2Pfvn1QKpW4du0a3nvvPaSmpmLjxo2lvm/hcSqVCo8fP8bBgwfxxRdf4Oeff8bBgwfVK7meOHECISEhCAkJwfbt2+Hq6opTp07hk08+QUREBA4cOKAxP42FhQUmTZqEQ4cO6fC7QGS8BEFAWk4+0rLzIQiC2OWQATI3lcLF1kK0z9eroeCRkZEICQnR2Na9e3dMnDixxGNycnI0Wg4UCkVllac3XFxcYG9vX+LrpqamcHNzAwB4enrizTffxA8//PDC933+OA8PDzRt2hRdu3aFv78/FixYgC+++AKCIOD9999Hw4YNsWXLFvUswrVr10a9evXQvHlzLFmyBJMmTVK/78iRI7Fq1Srs2LEDvXr1qsCZExmm7DwlEhTZeJSajScZuXiamYuUzDykZObi6bM/UzLz/t2elQeliqGGKk+LWvbYMqataJ+vV+EmPj4erq6uGttcXV2hUCiQlZUFS0vLIseEh4dj9uzZ5f5MQRCQlSfOcGJLMxPRR/3cuXMHu3fvLvdMvw0aNEDPnj2xZcsWfPHFFzh//jyuXr2K9evXF1kewd/fHyEhIfjtt980wo2Pjw9GjRqFsLAw9OjRg8sqkF56mpGLP6Lu4+8LD5GnFGBvZQYHK3PYW5mpv5ZbPr/NHA5WZjAzlSJRkYP41Gw8Ss0q+FOR/ex5NhIUBYGmPMxNpODAQqoMZibi/pzWq3BTHmFhYQgNDVU/VygU8PLyKvPxWXlKNJohTp+eq593h5V52S/RP//8AxsbG41txc3zUrNmTY3ntWvXxpUrV9TPL126BBsbGyiVSmRnZwMAFi9erE3pGho0aIA9e/YAAK5fvw6g4JZicQpvNf7XtGnT8MMPP+DXX3/Fu+++W+5aiKqSIAg4GfsEv52Kw85L8chVqirtsyzMpHCXW8LJxhz2VuawtzSDg/WzoGRZEJTkz0JUYYCyMDOptHqIxKRX4cbNzQ0JCQka2xISEmBnZ1dsqw0AyGQyyGSyqihPdC+//DJWrlypse3kyZN45513NLYdOXJE3f8FQJEZfOvXr4+//voL2dnZ+OWXX3D+
/HmMHz8eABAXF4dGjRqp950yZQqmTJlSal2CIBRpgSrtPn9xrUTOzs749NNPMWPGDAwYMKDUzyMS25OMXGyJuo/1p+JwOylDvb2Jpx0GBtaCp72lxm2j1KyCW0ZPM/OQ+uzPp5m5SMvOBwDYWpjCXW4BN7kl3OxkcJNbPntuAXe5BdztLGFnaSp6Sy9RdaFX4SY4OBg7duzQ2LZ3714EBwdX2mdampng6ufidFa21PJ/VdbW1vD19dXYdv/+/SL7+fj4lNrnxtzcXP0+8+fPxyuvvILZs2djzpw58PDwwPnz59X7Ojo6vrCua9euwcfHBwDg5+en3ta8efNi961Xr16x7xMaGopvvvkG33zzzQs/k6iqFbbSrD8Zh12X/22lsTY3QZ8ATwwKrIWmNeVavWe+UoVcpUqrFlwiEjncpKen4+bNm+rnsbGxOH/+PBwdHVGrVi2EhYXhwYMH+OmnnwAAo0aNwvLly/HZZ5/hvffew/79+/H7779j+/btlVajRCIx+h8s06ZNQ+fOnTF69Gh4eHgUCVCliY6Oxq5duxAWFgYAaN68ORo0aIAlS5bg7bff1ug/c+HCBezbtw/Lly8v9r1sbGwwffp0zJo1C3369KnYSRHpSHJ6DrZGPcBvp+JwO/nfVpqmnnIMCqqF3v4esJGV72eIqYkUpiL3XSDSR6L+1j5z5gxefvll9fPCvjFDhw7FunXr8OjRI8TFxalf9/Hxwfbt2/Hxxx/j66+/Rs2aNbF27VoOA9dSYmKiui9NoRo1apS4wGRwcDCaNWuGefPmlRg8ACA/Px/x8fFFhoIHBATg//7v/wAUhMW1a9eiW7du6N+/P8LCwuDm5oaTJ0/ik08+Qffu3fHhhx+W+BkjR47EkiVLsH79egQFBZXj7IkqLkGRjd1X4rHj0iOcin2CwoFHFWmlISLdETXcdOrUqdS+F8XNPtypUyecO3euEqsyfPXr1y+yLTIyEm3atCnxmI8//hjDhg3DpEmTSuyQfeXKFbi7u8PExARyuRyNGjVCWFgYRo8erdHvqW3btjhx4gRmz56Nnj17qicIHDduHJYsWQITk5Jvx5mZmWHOnDkYNGhQWU+XSCcepmRh5+V47Lz0CGfjnuL5H13+XvZ4u7VXhVppiEh3JIKRzeCkUCggl8uRmppaZCK77OxsxMbGwsfHBxYW4k0+ZGxUKhXef/997N69G4cOHVL3y6kIXkvShbjHmdh5+RF2XI7HhXspGq+1qGWPXk3d0b2xG7wcrcQpkMiIlPb7+7/4XwwSnVQqxXfffYdly5bhyJEjOgk3ROWVmpWHX0/exY5Lj3D5wb+TfkokQGtvR/Rq4obuTdzgLi9+hCYRiY/hhqoFqVSKCRMmiF0GGTGlSsDG0/fw1Z4Y9aR4UgkQXLcGejZxR7fGrqJOJ09EZcdwQ0RG7/SdJ5j11xVceVjQUlPX2RoftK+Dbo1cUcPGOObJIjIkDDdEZLQepmQhfGc0/r7wEEDBZHkTQ+phSHBt0aePJ6LyY7gphpH1sTZIvIZUmuw8JVYfvo2VB28hK08JiQR4u7UXPu1Wny01RAaA4eY5hfO8ZGZmlricA+mH3NyCPhOlDSsn4yMIAnZejsfc7dfwICULANDa2wEzezdGE0/OS0NkKBhunmNiYgJ7e3skJiYCAKysrLhWix5SqVRISkqClZUVTE35V5wKRMcrMPuvq4i8/RgA4C63QFivhujdzJ3/zokMDH/y/4ebmxsAqAMO6SepVIpatWrxlxZBpRKwYFc01hy5DZUAyEyl+LBDHYzqVNfol1YhMlT8l/0fEokE7u7ucHFxQV5entjlUDmZm5trrFtFxkkQBHz+z1WsO34HANCrqRvCejbkpHtEBo7hpgQmJibsr0Gk5xbtua4ONl+96Y83WtYUtyAiqhL8ry0RGaSVB29h+YGbAIA5fRsz2BAZEYYbIjI4P0XewYJd0QCAyT0b4N1gb3ELIqIqxXBDRAZl89n7mPHn
FQDA+M6+GNWxrsgVEVFVY7ghIoOx49IjfLb5AgBgeFtvhHatJ3JFRCQGhhsiMggHohMxYcM5qARgQCsvzHi1EacCIDJSDDdEpPcibz3GqF/OIk8poLe/B+a93pTBhsiIMdwQkV47F/cUH/x4Gjn5KoQ0dMHit/xhImWwITJmDDdEpLeuPlRg6PenkJGrRFvfGlg+qAVX8yYihhsi0k+3ktLx7ncnocjOR8vaDlj9bitYmHHiTSJiuCEiPXT3cQbeWXsSjzNy0djDDt8Paw1rGSdcJ6IC/GlARHrlyI0kjFt/DqlZefB1scFP7wVCbmkmdllEVI0w3BCRXhAEAd8evo0vd0VDJQD+XvZY825L1LCRiV0aEVUzDDdEVO1l5ubj/zZfxPaLjwAAb7Wqic/7NmEfGyIqFsMNEVVrcY8zMfLnM4iOT4OpVIKZfRrjnaBanMeGiErEcENE1dbh60kY/1tB/xonGxlWvtMCrb0dxS6LiKo5hhsiqnb+278mwMseq95pCTe5hdilEZEeYLghomrlv/1rBrTywuf9GkNmyv41RFQ2DDdEVG3cfZyBD38+i+j4NJiZSDCzd2MMZv8aItISww0RVQvP969xtpVh5eAWaMX+NURUDgw3RCS6e08y8cGPZ5CrVKF5rYL+Na527F9DROXDcENEovv+WCxylSoEejvi5w8C2b+GiCqEa0sRkahSM/Ow8fQ9AMC4zr4MNkRUYQw3RCSq9afikJmrRAM3W7T3cxK7HCIyAAw3RCSa3HwV1h2PBQB80L4OR0URkU4w3BCRaP6+8BAJihy42MrQx99D7HKIyEAw3BCRKARBwJojtwEAw9p6w9yUP46ISDf404SIRHH0ZjKi49NgZW6CwYG1xS6HiAwIww0RiWL14YJWm7daeUFuZSZyNURkSBhuiKjKXXukwJEbyZBKgPfb+YhdDhEZGIYbIqpya48UjJDq2cQdXo5WIldDRIaG4YaIqlSCIht/XXgAAPigPVttiEj3GG6IqEqtO34HeUoBrb0d0LyWg9jlEJEBYrghoiqTkZOPX0/cBQCMaF9H5GqIyFAx3BBRlfn9zD0osvPh42SNkIauYpdDRAaK4YaIqkS+UoXvjxV0JH6/nQ+kUi61QESVg+GGiKrE7isJuPckCw5WZujfoqbY5RCRAWO4IaJKJwgCVj9bauHdYG9YmpuIXBERGTKGGyKqdGfuPsWFeykwN5ViSDCXWiCiysVwQ0SVbs2zpRb6t/CEk41M5GqIyNAx3BBRpYpNzsDeawkAgPfbcfg3EVU+hhsiqlTfHb0NQQC6NHCBr4uN2OUQkRFguCGiSvMkIxebztwHAHzASfuIqIow3BBRpfnlxF3k5KvQ1FOONnUcxS6HiIwEww0RVYrsPCV+PH4HQMECmRIJJ+0joqrBcENElWLruQd4nJELT3tL9GrqLnY5RGRERA83K1asgLe3NywsLBAUFIRTp06Vuv/SpUtRv359WFpawsvLCx9//DGys7OrqFoiKovcfBVWHLgJABje1htmJqL/qCEiIyLqT5yNGzciNDQUM2fORFRUFPz9/dG9e3ckJiYWu//69esxefJkzJw5E9euXcN3332HjRs3YsqUKVVcORGV5rdTcbj/NAsutjIMDuKkfURUtUQNN4sXL8aIESMwfPhwNGrUCKtWrYKVlRW+//77Yvc/fvw42rZti0GDBsHb2xvdunXDwIEDX9jaQ0RVJyMnH8v2F7TafNTFj0stEFGVEy3c5Obm4uzZswgJCfm3GKkUISEhiIyMLPaYl156CWfPnlWHmdu3b2PHjh3o1atXiZ+Tk5MDhUKh8SCiyvPDsVgkp+egdg0rDGjtJXY5RGSETMX64OTkZCiVSri6umpsd3V1RXR0dLHHDBo0CMnJyWjXrh0EQUB+fj5GjRpV6m2p8PBwzJ49W6e1E1Hxnmbk4ttDBUsthHatx742RCQKvfrJc/DgQcybNw/ffPMNoqKisGXLFmzfvh1z
5swp8ZiwsDCkpqaqH/fu3avCiomMy6pDt5CWk4+G7nbo3cxD7HKIyEiJ1nLj5OQEExMTJCQkaGxPSEiAm5tbscdMnz4d7777Lj744AMAQNOmTZGRkYGRI0di6tSpkEqLZjWZTAaZjAv1EVW2+NRsrHs2r81n3etDKuW8NkQkDtFabszNzdGyZUtERESot6lUKkRERCA4OLjYYzIzM4sEGBOTgs6KgiBUXrFE9EJfR9xATr4Krb0d0Km+s9jlEJERE63lBgBCQ0MxdOhQtGrVCoGBgVi6dCkyMjIwfPhwAMCQIUPg6emJ8PBwAEDv3r2xePFiNG/eHEFBQbh58yamT5+O3r17q0MOEVW920np+P1MwS3fz3o04GzERCQqUcPNgAEDkJSUhBkzZiA+Ph4BAQHYtWuXupNxXFycRkvNtGnTIJFIMG3aNDx48ADOzs7o3bs35s6dK9YpEBGAxXuvQ6kS0LmBC1p7cw0pIhKXRDCy+zkKhQJyuRypqamws7MTuxwivXf5QSpeXXYUEgmw46P2aOjOf1dEpHva/P7Wq9FSRFT9LNwdAwDo4+/BYENE1QLDDRGV24nbj3HoehJMpRKEdq0ndjlERAAYboionARBwJe7CibcfDvQC7VrWItcERFRAYYbIiqXfdcSERWXAgszKT7q7Cd2OUREagw3RKQ1pUrAV8/62gxv6wMXOwuRKyIi+hfDDRFp7c/zDxCTkAY7C1OM6lBX7HKIiDQw3BCRVnLzVVi89zoAYHQnX8itzESuiIhIE8MNEWnlt1NxuP80Cy62Mgx7yVvscoiIimC4IaIyy8jJx7L9NwEAH3Xxg6U5lz0houqH4YaIyuyHY7FITs9B7RpWGNDaS+xyiIiKxXBDRGWSmpmHbw/dBgCEdq0HMxP++CCi6ok/nYioTPZdS0BaTj58XWzQu5mH2OUQEZWI4YaIyuRATCIAoEdjN0ilEpGrISIqGcMNEb1QvlKFIzeSAQAvN3AWuRoiotIx3BDRC124n4LUrDzILc0Q4OUgdjlERKViuCGiFzoQnQQAaO/nBBPekiKiao7hhohe6OD1gv42L9d3EbkSIqIXY7gholIlpmXj8gMFAKBDPfa3IaLqj+GGiEp1KKbgllSzmnI428pEroaI6MUYboioVAefhZtObLUhIj3BcENEJSoYAl4Qbjqyvw0R6QmGGyIq0bl7KVBk58PeygwBXvZil0NEVCYMN0RUogPRBaOkOvg5cwg4EekNhhsiKlFhfxvOSkxE+oThhoiKlaDIxtVHCkgkBS03RET6guGGiIqlHgLuKUcNGw4BJyL9wXBDRMUqnJW4E0dJEZGeYbghoiLylCocuV6wCnin+rwlRUT6heGGiIqIuvsUaTn5cLQ2R7Oa9mKXQ0SkFYYbIiriwLP+Nh24CjgR6SGGGyIq4mDMs1XAG7C/DRHpH4YbItIQn5qN6Pg0SCRAew4BJyI9xHBDRBoKW238a9rD0dpc5GqIiLTHcENEGtSzEnMIOBHpKYYbIlLLzVfh6E0OASci/cZwQ0RqZ+8+RXpOPmpYm6Opp1zscoiIyoXhhojUCmcl7ljPGVIOASciPcVwQ0RqB6ML+tt05C0pItJjDDdEBAB4mJKFmIQ0SLkKOBHpOYYbIgLw7yipAC97OHAIOBHpMYYbIgLw3KzEHAJORHqO4YaIkJuvwjH1EHCGGyLSbww3RIQzd54gI1cJJxtzNPawE7scIqIKYbghIhy8/myUVD0XDgEnIr3HcENEOBBd0N+GsxITkSFguCEycvefZuJGYjqHgBORwWC4ITJyhUPAW9RygNzKTORqiIgqjuGGyMgVhhvekiIiQ8FwQ2TEcvKVOH6LQ8CJyLAw3BAZsdOxT5GZq4SzrYxDwInIYDDcEBmxwlmJO9VzhkTCIeBEZBjKFW5SUlKwdu1ahIWF4cmTJwCAqKgoPHjwQKfFEVHl2q8eAs5bUkRkOEy1PeDixYsICQmBXC7HnTt3MGLECDg6
OmLLli2Ii4vDTz/9VBl1EpGO3UxMx+3kDJibSNGhnpPY5RAR6YzWLTehoaEYNmwYbty4AQsLC/X2Xr164fDhwzotjogqz56r8QCAl3xrwNaCQ8CJyHBoHW5Onz6NDz/8sMh2T09PxMfH66QoIqp8e64kAAC6NXITuRIiIt3SOtzIZDIoFIoi269fvw5nZ86TQaQP4lOzcf5eCiQSIKQR+9sQkWHROtz06dMHn3/+OfLy8gAAEokEcXFxmDRpEvr37691AStWrIC3tzcsLCwQFBSEU6dOlbp/SkoKxo4dC3d3d8hkMtSrVw87duzQ+nOJjNneawWtNs297OFia/GCvYmI9IvW4WbRokVIT0+Hi4sLsrKy0LFjR/j6+sLW1hZz587V6r02btyI0NBQzJw5E1FRUfD390f37t2RmJhY7P65ubno2rUr7ty5g82bNyMmJgZr1qyBp6entqdBZNT2XCm4hdytMW9JEZHhkQiCIJTnwGPHjuHChQtIT09HixYtEBISovV7BAUFoXXr1li+fDkAQKVSwcvLC+PHj8fkyZOL7L9q1SosXLgQ0dHRMDMrXwdIhUIBuVyO1NRU2Nlx0jIyPqlZeWg5Zy/yVQL2f9IRdZxtxC6JiOiFtPn9rXXLzU8//YScnBy0bdsWY8aMwWeffYaQkBDk5uZqNQw8NzcXZ8+e1QhFUqkUISEhiIyMLPaYv/76C8HBwRg7dixcXV3RpEkTzJs3D0qlssTPycnJgUKh0HgQGbODMYnIVwnwdbFhsCEig6R1uBk+fDhSU1OLbE9LS8Pw4cPL/D7JyclQKpVwdXXV2O7q6lriqKvbt29j8+bNUCqV2LFjB6ZPn45Fixbhiy++KPFzwsPDIZfL1Q8vL68y10hkiPZcLRwl5fqCPYmI9JPW4UYQhGKnab9//z7kcrlOiiqJSqWCi4sLVq9ejZYtW2LAgAGYOnUqVq1aVeIxYWFhSE1NVT/u3btXqTUSVWc5+UocfDYrMfvbEJGhKvMMxc2bN4dEIoFEIkGXLl1gavrvoUqlErGxsejRo0eZP9jJyQkmJiZISEjQ2J6QkAA3t+J/6Lq7u8PMzAwmJibqbQ0bNkR8fDxyc3Nhbm5e5BiZTAaZTFbmuogM2fFbj5GRq4SrnQzNPCv3PyNERGIpc7jp168fAOD8+fPo3r07bGz+vVdvbm4Ob29vrYaCm5ubo2XLloiIiFC/t0qlQkREBMaNG1fsMW3btsX69euhUqkglRY0Ol2/fh3u7u7FBhsi0lQ4cV/XRq6QSrlQJhEZpjKHm5kzZwIAvL29MWDAAI2lF8orNDQUQ4cORatWrRAYGIilS5ciIyND3XdnyJAh8PT0RHh4OABg9OjRWL58OSZMmIDx48fjxo0bmDdvHj766KMK10Jk6FQqAXuvclZiIjJ8Wi+cOXToUJ19+IABA5CUlIQZM2YgPj4eAQEB2LVrl7qTcVxcnLqFBgC8vLywe/dufPzxx2jWrBk8PT0xYcIETJo0SWc1ERmqc/dSkJyeA1uZKdrUqSF2OURElUbreW6USiWWLFmC33//HXFxccjNzdV4/cmTJzotUNc4zw0Zq/Cd1/Dtodvo4++B/w1sLnY5RERaqdR5bmbPno3FixdjwIABSE1NRWhoKF5//XVIpVLMmjWrvDUTUSUSBOHfhTIbcwg4ERk2rcPNr7/+ijVr1uCTTz6BqakpBg4ciLVr12LGjBk4ceJEZdRIRBV0KykdsckZMDeRomM9LnBLRIZN63ATHx+Ppk2bAgBsbGzUE/q9+uqr2L59u26rIyKd2P2s1eYl3xqwtSjf0iVERPpC63BTs2ZNPHr0CABQt25d7NmzBwBw+vRpzidDVE3t4SgpIjIiWoeb1157DREREQCA8ePHY/r06fDz88OQIUPw3nvv6bxAIqqY+NRsXLiXAokECGnkInY5RESVTuuh4PPnz1d/PWDAANSuXRvHjx+Hn58fevfurdPiiKji9l4tWKutRS0H
uNhWfH4qIqLqTqtwk5eXhw8//BDTp0+Hj48PAKBNmzZo06ZNpRRHRBXHhTKJyNhodVvKzMwMf/zxR2XVQkQ6lpqVh8hbjwFwoUwiMh5a97np168ftm3bVgmlEJGuHYxJRL5KgJ+LDXycrMUuh4ioSmjd58bPzw+ff/45jh07hpYtW8LaWvMHJtd5Iqo+OHEfERkjrZdfKOxrU+ybSSS4fft2hYuqTFx+gYxFdp4SLefsRUauEn+ObQt/L3uxSyIiKjdtfn9r3XITGxtb7sKIqOpE3nqMjFwl3Ows0NRTLnY5RERVRus+N0SkH/Y8GwLetZErpFKJyNUQEVUdhhsiA6RUCdh7lf1tiMg4MdwQGaDz954iOT0XthamCPKpIXY5RERViuGGyAAVjpLq3MAF5qb8Z05ExoU/9YgMjCAI2H2loL8NF8okImOkdbjZtWsXjh49qn6+YsUKBAQEYNCgQXj69KlOiyMi7d1MTMedx5kwN5GiY31nscshIqpyWoeb//u//4NCoQAAXLp0CZ988gl69eqF2NhYhIaG6rxAItJO4VpSbX1rwEam9WwPRER6r1zz3DRq1AgA8Mcff+DVV1/FvHnzEBUVhV69eum8QCLSzp7CW1JcS4qIjJTWLTfm5ubIzMwEAOzbtw/dunUDADg6OqpbdIhIHI9Ss3DhfiokEqBLQxexyyEiEoXWLTft2rVDaGgo2rZti1OnTmHjxo0AgOvXr6NmzZo6L5CIyu6fC48AAC1qOcDF1kLkaoiIxKF1y83y5cthamqKzZs3Y+XKlfD09AQA7Ny5Ez169NB5gURUNoIg4LdTcQCAN1ryPxpEZLy0XjhT33HhTDJUJ24/xturT8Da3AQnp4awMzERGRRtfn9r3XITFRWFS5cuqZ//+eef6NevH6ZMmYLc3FztqyUinShstekT4MlgQ0RGTetw8+GHH+L69esAgNu3b+Ptt9+GlZUVNm3ahM8++0znBRLRiz3JyMXOSwWjpAYF1hK5GiIicWkdbq5fv46AgAAAwKZNm9ChQwesX78e69atwx9//KHr+oioDLZE3UeuUoUmnnZoWlMudjlERKLSOtwIggCVSgWgYCh44dw2Xl5eSE5O1m11RPRCgiBg/bNbUgPZakNEpH24adWqFb744gv8/PPPOHToEF555RUABZP7ubq66rxAIirdqdgnuJ2UAStzE/Tx9xC7HCIi0WkdbpYuXYqoqCiMGzcOU6dOha+vLwBg8+bNeOmll3ReIBGVrrAjcd8AD9hamIlcDRGR+LQeUtGsWTON0VKFFi5cCBMTE50URURl8zQjFzsuF3Qk5i0pIqICWrfcAEBKSgrWrl2LsLAwPHnyBABw9epVJCYm6rQ4IirdlnMPkJuvQmMPOzT1ZEdiIiKgHC03Fy9eRJcuXWBvb487d+5gxIgRcHR0xJYtWxAXF4effvqpMuokov8QBAHrT94FUNBqI5FIRK6IiKh60LrlJjQ0FMOHD8eNGzdgYfHv2jW9evXC4cOHdVocEZXs9J2nuJWUAUszE/QNYEdiIqJCWoeb06dP48MPPyyy3dPTE/Hx8TopioheTD0jsT87EhMRPU/rcCOTyaBQKIpsv379OpydnXVSFBGVLiUzF9svFawAPjCIHYmJiJ6ndbjp06cPPv/8c+Tl5QEAJBIJ4uLiMGnSJPTv31/nBRJRUVuiCjoSN3S3gz9nJCYi0qB1uFm0aBHS09Ph4uKCrKwsdOzYEb6+vrC1tcXcuXMro0Yieo4gCOpbUoOC2JGYiOi/tB4tJZfLsXfvXhw7dgwXLlxAeno6WrRogZCQkMqoj4j+4+zdp7iRmM6OxEREJdA63BRq27Yt2rZtq8taiKgM1p8saLXp7e8OO3YkJiIqQuvbUh999BH+97//Fdm+fPlyTJw4URc1EVEJUjJz8U9hR2LOSExEVCytw80ff/xRbIvNSy+9hM2bN+ukKCIq3tZnMxI3cLNFgJe92OUQEVVLWoebx48f
Qy4vOjrDzs4OycnJOimKiIpiR2IiorLROtz4+vpi165dRbbv3LkTderU0UlRRFRUVNxTXE9Ih4WZFH0DPMUuh4io2tK6Q3FoaCjGjRuHpKQkdO7cGQAQERGBRYsWYenSpbquj4ieWX/yHgCgdzMPyC3ZkZiIqCRah5v33nsPOTk5mDt3LubMmQMA8Pb2xsqVKzFkyBCdF0hEQGpmHv65+BAAZyQmInqRcg0FHz16NEaPHo2kpCRYWlrCxsZG13UR0XO2nruPnGcdiZuzIzERUam0DjexsbHIz8+Hn5+fxlpSN27cgJmZGby9vXVZH5HRK+hIXHBLamAgOxITEb2I1h2Khw0bhuPHjxfZfvLkSQwbNkwXNRHRc6LiUhCTkAaZqRT9mrMjMRHRi2gdbs6dO1fsPDdt2rTB+fPndVETET2ncPj3q+xITERUJlqHG4lEgrS0tCLbU1NToVQqdVIUERXIU6qwUz0jsZfI1RAR6Qetw02HDh0QHh6uEWSUSiXCw8PRrl07nRZHZOwuPUhFRq4ScksztKjlIHY5RER6QesOxQsWLECHDh1Qv359tG/fHgBw5MgRKBQK7N+/X+cFEhmzE7cfAwCCfBwhlbIjMRFRWWjdctOoUSNcvHgRb731FhITE5GWloYhQ4YgOjoaTZo0qYwaiYxW5K2CcBNct4bIlRAR6Y9yzXPj4eGBefPm6boWInpObr4KZ+48BcBwQ0SkDa3DzeHDh0t9vUOHDuUuhoj+delBCrLylHCwMkM9F1uxyyEi0htah5tOnToV2fb8pGLlGTG1YsUKLFy4EPHx8fD398eyZcsQGBj4wuM2bNiAgQMHom/fvti2bZvWn0tUnRXekmpTpwb72xARaUHrPjdPnz7VeCQmJmLXrl1o3bo19uzZo3UBGzduRGhoKGbOnImoqCj4+/uje/fuSExMLPW4O3fu4NNPP1V3aiYyNJG32d+GiKg8tA43crlc4+Hk5ISuXbtiwYIF+Oyzz7QuYPHixRgxYgSGDx+ORo0aYdWqVbCyssL3339f4jFKpRKDBw/G7NmzUadOHa0/k6i6y8lX4uzdgv42beow3BARaUPrcFMSV1dXxMTEaHVMbm4uzp49i5CQkH8LkkoREhKCyMjIEo/7/PPP4eLigvfff7/c9RJVZxfupSI7TwUnG3P4uXBhWiIibWjd5+bixYsazwVBwKNHjzB//nwEBARo9V7JyclQKpVwdXXV2O7q6oro6Ohijzl69Ci+++67Mi/1kJOTg5ycHPVzhUKhVY1EYijsbxNUpwYXyiQi0pLW4SYgIAASiQSCIGhsb9OmTam3knQhLS0N7777LtasWQMnJ6cyHRMeHo7Zs2dXal1EulY4eR9vSRERaU/rcBMbG6vxXCqVwtnZGRYWFlp/uJOTE0xMTJCQkKCxPSEhAW5ubkX2v3XrFu7cuYPevXurt6lUKgCAqakpYmJiULduXY1jwsLCEBoaqn6uUCjg5cU1eqj6ys5T4mzcs/ltGG6IiLSmdbipXbu2zj7c3NwcLVu2REREBPr16wegIKxERERg3LhxRfZv0KABLl26pLFt2rRpSEtLw9dff11saJHJZJDJZDqrmaiynb+Xgtx8FZxtZajrbC12OUREeqfMHYojIyPxzz//aGz76aef4OPjAxcXF4wcOVKjb0tZhYaGYs2aNfjxxx9x7do1jB49GhkZGRg+fDgAYMiQIQgLCwMAWFhYoEmTJhoPe3t72NraokmTJjA3N9f684mqm+fnt2F/GyIi7ZW55ebzzz9Hp06d8OqrrwIALl26hPfffx/Dhg1Dw4YNsXDhQnh4eGDWrFlaFTBgwAAkJSVhxowZiI+PR0BAAHbt2qXuZBwXFwepVGeDuoiqPfX8NrwlRURULhLhvz2DS+Du7o6///4brVq1AgBMnToVhw4dwtGjRwEAmzZtwsyZM3H16tXKq1YHFAoF5HI5UlNTYWdnJ3Y5RBqy85RoNmsPcpUqHPi0E3yc
eFuKiAjQ7vd3mZtEnj59qjFk+9ChQ+jZs6f6eevWrXHv3r1ylEtEhaLuPkWuUgVXOxm8a1iJXQ4RkV4qc7hxdXVVj5TKzc1FVFQU2rRpo349LS0NZmZmuq+QyIg8f0uK/W2IiMqnzOGmV69emDx5Mo4cOYKwsDBYWVlprOt08eLFIsOwiUg7J7ieFBFRhZW5Q/GcOXPw+uuvo2PHjrCxscGPP/6oMTrp+++/R7du3SqlSCJjkJWrxPl7KQA4eR8RUUWUOdw4OTnh8OHDSE1NhY2NDUxMTDRe37RpE2xsuAYOUXmdufsEeUoBHnIL1HJkfxsiovLSehI/uVxe7HZHR8cKF0NkzNRLLtRlfxsioorgBDJE1cTzk/cREVH5MdwQVQMZOfm4eD8VACfvIyKqKIYbomrgzN2nyFcJqOlgCS/2tyEiqhCGG6JqgLekiIh0h+GGqBrgelJERLrDcEMksrTsPFx+UNDfpg0n7yMiqjCGGyKRnbnzFEqVgFqOVvC0txS7HCIivcdwQyQy3pIiItIthhsikXE9KSIi3WK4IRKR4vn+Nmy5ISLSCYYbIhGduv0EKgHwcbKGm9xC7HKIiAwCww2RiNTrSbHVhohIZxhuiEQUqQ43XHiWiEhXGG6IRJKamYerjxQAOFKKiEiXGG6IRHIy9jEEAajrbA0XO/a3ISLSFYYbIpFEsr8NEVGlYLghEsmJ208AcH4bIiJdY7ghEsHTjFxce9bfhi03RES6xXBDJIKTsQW3pPxcbOBkIxO5GiIiw8JwQyQC3pIiIqo8DDdEIoi8xcUyiYgqC8MNURV7nJ6DmIQ0AEAQww0Rkc4x3BBVscJbUg3cbOFobS5yNUREhofhhqiK/XXhAQCgna+TyJUQERkmhhuiKhSfmo191xIBAG+19hK5GiIiw8RwQ1SFfjsVB6VKQKCPI+q52opdDhGRQWK4IaoieUoVNpyOAwC806a2yNUQERkuhhuiKhJxLQEJihw42ZijR2M3scshIjJYDDdEVeSXEwWtNgNae8HclP/0iIgqC3/CElWB20npOHozGRIJMDCwltjlEBEZNIYboirw68mCVpvO9V1Q08FK5GqIiAwbww1RJcvOU2Lz2fsA2JGYiKgqMNwQVbK/LzxEalYeajpYokM9Z7HLISIyeAw3RJXsl2e3pAYH1YaJVCJyNUREho/hhqgSXbqfigv3UmBuIsVbrWqKXQ4RkVFguCGqRL+cuAsA6NnUDTVsZCJXQ0RkHBhuiCpJalYe/ny2SCY7EhMRVR2GG6JKsiXqPrLzVKjvaotWtR3ELoeIyGgw3BBVAkEQ1HPbvBNcGxIJOxITEVUVhhuiSnDi9hPcTEyHtbkJXmvuKXY5RERGheGGqBIUdiTu19wTNjJTkashIjIuDDdEOpaoyMbuK/EA2JGYiEgMDDdEOrbx9D3kqwS0rO2Ahu52YpdDRGR0GG6IdEipEvDbqWcdidtw9W8iIjEw3BDp0P7oRDxMzYajtTl6NnEXuxwiIqPEcEOkQz8/60j8ZquasDAzEbkaIiLjxHBDpCN3H2fg8PUkSCTA4EB2JCYiEgvDDZGOrH82aV8HP2fUqmElcjVERMaL4YZIB7LzlPj9zD0AHP5NRCQ2hhsiHdh5+RGeZubB094SnRu4iF0OEZFRqxbhZsWKFfD29oaFhQWCgoJw6tSpEvdds2YN2rdvDwcHBzg4OCAkJKTU/Ymqwi8nCm5JDQz0gomU60gREYlJ9HCzceNGhIaGYubMmYiKioK/vz+6d++OxMTEYvc/ePAgBg4ciAMHDiAyMhJeXl7o1q0bHjx4UMWVExWIuJaAs3efwsxEgrdae4ldDhGR0ZMIgiCIWUBQUBBat26N5cuXAwBUKhW8vLwwfvx4TJ48+YXHK5VKODg4YPny5RgyZMgL91coFJDL5UhNTYWdHWePpYrJzlOi25LDiHuSiQ871EFYr4Zil0REZJC0+f0tastNbm4uzp49i5CQEPU2qVSK
kJAQREZGluk9MjMzkZeXB0dHx8oqk6hE3x66jbgnmXCzs8D4Ln5il0NERABEXa44OTkZSqUSrq6uGttdXV0RHR1dpveYNGkSPDw8NALS83JycpCTk6N+rlAoyl8w0XPuPcnENwdvAgCmvtKQq38TEVUTove5qYj58+djw4YN2Lp1KywsLIrdJzw8HHK5XP3w8mKfCNKN2X9fRU6+Ci/VrYFXm3GpBSKi6kLUcOPk5AQTExMkJCRobE9ISICbm1upx3711VeYP38+9uzZg2bNmpW4X1hYGFJTU9WPe/fu6aR2Mm4HohOx71oCTKUSzO7TGBIJR0gREVUXooYbc3NztGzZEhEREeptKpUKERERCA4OLvG4L7/8EnPmzMGuXbvQqlWrUj9DJpPBzs5O40FUEdl5Ssz6+woA4L12PvBztRW5IiIiep7onQRCQ0MxdOhQtGrVCoGBgVi6dCkyMjIwfPhwAMCQIUPg6emJ8PBwAMCCBQswY8YMrF+/Ht7e3oiPjwcA2NjYwMbGRrTzIOOx5vBt3H2cCVc7GT5iJ2IiompH9HAzYMAAJCUlYcaMGYiPj0dAQAB27dql7mQcFxcHqfTfBqaVK1ciNzcXb7zxhsb7zJw5E7NmzarK0skI3XuSieUHCjoRT+nFTsRERNWR6PPcVDXOc0MVMfKnM9hzNQFBPo7YMLIN+9oQEVURvZnnhkifHIxJxJ6rCTCRSjCnXxMGGyKiaorhhqgMcvKVmPVXQSfi4S95ox47ERMRVVsMN0RlsObwbdx5nAlnWxkmhLATMRFRdcZwQ/QC95/+24l4aq+GsLUwE7kiIiIqDcMN0Qt88c81ZOepEOjjiL4BHmKXQ0REL8BwQ1SKQ9eTsOtKfEEn4r7sRExEpA8YbohK8Hwn4qHB3qjvxk7ERET6gOGGqARrj8QiNjkDTjYyTOzKTsRERPqC4YaoGA9SsrB8/7NOxK80gB07ERMR6Q2GG6JiLNwVjaw8JQK9HdEvwFPscoiISAsMN0T/cSMhDX9eeAgAmP5qI3YiJiLSMww3RP+xNOIGBAHo3tgVTWvKxS6HiIi0xHBD9JzoeAW2X3wEAJgYUk/kaoiIqDwYboies2TvdQDAK03d0dCdq8YTEekjhhuiZy4/SMXuKwmQSICJXD+KiEhvMdwQPVPYatPH3wN+XPWbiEhvMdwQATh/LwUR0YmQSoAJXdhqQ0SkzxhuiAAsftZq81rzmqjjbCNyNUREVBEMN2T0ztx5gsPXk2AilbDVhojIADDckNErbLV5s2VN1KphJXI1RERUUQw3ZNQibz3G8VuPYWYiwbjOvmKXQ0REOsBwQ0ZLEAT1CKkBrb1Q04GtNkREhoDhhozWsZuPcerOE5ibSjH2ZbbaEBEZCoYbMkqCIGDR3hgAwKDAWnCXW4pcERER6QrDDRmlg9eTcC4uBRZmUox5ua7Y5RARkQ4x3JDReb6vzbttasPF1kLkioiISJcYbsjo7LuWiIv3U2FlboJRHdlqQ0RkaBhuyKioVIJ6XpuhL3mjho1M5IqIiEjXGG7IqOy+Eo9rjxSwkZliZPs6YpdDRESVgOGGjIZSJWDJvoJWm/faesPB2lzkioiIqDIw3JDR2H7pEa4npMPWwhTvt2OrDRGRoWK4IaOQp1Rh6bNWmxHt60BuZSZyRUREVFkYbsjgCYKAGX9ewe2kDNhbmWF4W2+xSyIiokrEcEMGb93xO/jtVBwkEuCrN/xha8FWGyIiQ8ZwQwbtYEwi5vxzFQAQ1rMBQhq5ilwRERFVNoYbMlg3E9Mwfv05qATgzZY1MYJDv4mIjALDDRmkpxm5eP/HM0jLyUdrbwd88VoTSCQSscsiIqIqwHBDBic3X4XRv57F3ceZqOlgiVXvtITM1ETssoiIqIow3JBBEQQBM/+6jBO3n8Da3ATfDW3NJRaIiIwMww0ZlB+O3cFvp+5BIgGWDWqO+m62YpdERERVjOGGDMaBmER8sb1gZNSUng3RuQFH
RhERGSOGGzIINxLS8NGzkVFvtaqJD9r7iF0SERGJhOGG9N6T50ZGBfo44ot+TTkyiojIiDHckF7LzVdh9C9nEfckE16OBSOjzE3515qIyJjxtwDprYI1oy7jZOwT2MhM8d3Q1nC0Nhe7LCIiEpmp2AUQlceTjFx8ve86Npy+B6kEWDawOeq5cmQUEREx3JCeScnMxdojsfjhWCwycpUAgCm9GuLlBi4iV0ZERNUFww3phdSsPHx/NBbfH41FWk4+AKCxhx0+6VaPQ76JiEgDww1Va+k5+Vh3LBarD9+GIrsg1DRws8XHXeuhWyNXjooiIqIiGG6oWsrIycdPkXfx7eFbSMnMAwD4udhgYkg99GziBqmUoYaIiIrHcEPVSlauEr+cuItVh27hcUYuAKCOkzUmhPjh1WYeMGGoISKiF2C4IdEJgoArDxXYeu4B/jz/AMnpBaGmdg0rTOjihz7+HjA14awFRERUNgw3JJpHqVnYdu4htp67j+sJ6ertNR0s8VFnP7zWwhNmDDVERKQlhhuqUuk5+dh56RG2nnuAyNuPIQgF281Npeja0BWvNfdEx/rODDVERFRuDDdU6fKVKhy5mYytUQ+w52o8svNU6tcCfRzxenNP9GzqDrmlmYhVEhGRoWC4oUqRlatE5O1kHIhOws7L8UhOz1G/VsfZGq8390TfAE94OVqJWCURERkihhvSmXtPMrE/OhEHYhIReesxcvL/baFxtDZHH38PvNbcE81qyjk/DRERVZpqEW5WrFiBhQsXIj4+Hv7+/li2bBkCAwNL3H/Tpk2YPn067ty5Az8/PyxYsAC9evWqwooJKFiR+8ydJzgQk4j90Ym4lZSh8bqnvSVebuCMLg1c0c7Pif1oiIioSogebjZu3IjQ0FCsWrUKQUFBWLp0Kbp3746YmBi4uBRdL+j48eMYOHAgwsPD8eqrr2L9+vXo168foqKi0KRJExHOwDhk5OTjUWo24lOzcfdJBo5cT8bRm8lIf7YUAgCYSCVoVdsBLzdwQecGLvBzsWELDRERVTmJIBSOVxFHUFAQWrdujeXLlwMAVCoVvLy8MH78eEyePLnI/gMGDEBGRgb++ecf9bY2bdogICAAq1ateuHnKRQKyOVypKamws7OTmfnkZOvRFJazot3rKbSnwsvBX9mIV6Rg/jULDxKzUZadn6xxznZmKNTfRe8XN8F7fyc2CmYiIgqhTa/v0VtucnNzcXZs2cRFham3iaVShESEoLIyMhij4mMjERoaKjGtu7du2Pbtm3F7p+Tk4OcnH9Dh0KhqHjhxbjyUIHXvzleKe9dXdjKTOEmt4Cb3AItazugcwMXNPGQcykEIiKqVkQNN8nJyVAqlXB11VzV2dXVFdHR0cUeEx8fX+z+8fHxxe4fHh6O2bNn66bgUkgAyEz1t0+JlbkJ3OSWcH8WXtztLOAqt4D7s4ernQVsLdgqQ0RE1Z/ofW4qW1hYmEZLj0KhgJeXl84/p3ktB8R80VPn70tERETaETXcODk5wcTEBAkJCRrbExIS4ObmVuwxbm5uWu0vk8kgk8l0UzARERFVe6LeRzE3N0fLli0RERGh3qZSqRAREYHg4OBijwkODtbYHwD27t1b4v5ERERkXES/LRUaGoqhQ4eiVatWCAwMxNKlS5GRkYHhw4cDAIYMGQJPT0+Eh4cDACZMmICOHTti0aJFeOWVV7BhwwacOXMGq1evFvM0iIiIqJoQPdwMGDAASUlJmDFjBuLj4xEQEIBdu3apOw3HxcVBKv23gemll17C+vXrMW3aNEyZMgV+fn7Ytm0b57ghIiIiANVgnpuqVlnz3BAREVHl0eb3t/6OXSYiIiIqBsMNERERGRSGGyIiIjIoDDdERERkUBhuiIiIyKAw3BAREZFBYbghIiIig8JwQ0RERAaF4YaIiIgMiujLL1S1wgmZFQqFyJUQERFRWRX+3i7LwgpGF27S0tIAAF5eXiJXQkRERNpKS0uDXC4vdR+j
W1tKpVLh4cOHsLW1hUQi0el7KxQKeHl54d69ewa9bpUxnKcxnCPA8zQ0PE/DYQznCGh3noIgIC0tDR4eHhoLahfH6FpupFIpatasWamfYWdnZ9B/GQsZw3kawzkCPE9Dw/M0HMZwjkDZz/NFLTaF2KGYiIiIDArDDRERERkUhhsdkslkmDlzJmQymdilVCpjOE9jOEeA52loeJ6GwxjOEai88zS6DsVERERk2NhyQ0RERAaF4YaIiIgMCsMNERERGRSGGyIiIjIoDDc6smLFCnh7e8PCwgJBQUE4deqU2CXp1KxZsyCRSDQeDRo0ELusCjt8+DB69+4NDw8PSCQSbNu2TeN1QRAwY8YMuLu7w9LSEiEhIbhx44Y4xVbAi85z2LBhRa5vjx49xCm2nMLDw9G6dWvY2trCxcUF/fr1Q0xMjMY+2dnZGDt2LGrUqAEbGxv0798fCQkJIlVcPmU5z06dOhW5nqNGjRKp4vJZuXIlmjVrpp7cLTg4GDt37lS/bgjXEnjxeRrCtfyv+fPnQyKRYOLEieptur6eDDc6sHHjRoSGhmLmzJmIioqCv78/unfvjsTERLFL06nGjRvj0aNH6sfRo0fFLqnCMjIy4O/vjxUrVhT7+pdffon//e9/WLVqFU6ePAlra2t0794d2dnZVVxpxbzoPAGgR48eGtf3t99+q8IKK+7QoUMYO3YsTpw4gb179yIvLw/dunVDRkaGep+PP/4Yf//9NzZt2oRDhw7h4cOHeP3110WsWntlOU8AGDFihMb1/PLLL0WquHxq1qyJ+fPn4+zZszhz5gw6d+6Mvn374sqVKwAM41oCLz5PQP+v5fNOnz6Nb7/9Fs2aNdPYrvPrKVCFBQYGCmPHjlU/VyqVgoeHhxAeHi5iVbo1c+ZMwd/fX+wyKhUAYevWrernKpVKcHNzExYuXKjelpKSIshkMuG3334ToULd+O95CoIgDB06VOjbt68o9VSWxMREAYBw6NAhQRAKrp2ZmZmwadMm9T7Xrl0TAAiRkZFilVlh/z1PQRCEjh07ChMmTBCvqEri4OAgrF271mCvZaHC8xQEw7qWaWlpgp+fn7B3716N86qM68mWmwrKzc3F2bNnERISot4mlUoREhKCyMhIESvTvRs3bsDDwwN16tTB4MGDERcXJ3ZJlSo2Nhbx8fEa11YulyMoKMjgri0AHDx4EC4uLqhfvz5Gjx6Nx48fi11ShaSmpgIAHB0dAQBnz55FXl6exvVs0KABatWqpdfX87/nWejXX3+Fk5MTmjRpgrCwMGRmZopRnk4olUps2LABGRkZCA4ONthr+d/zLGQo13Ls2LF45ZVXNK4bUDn/No1u4UxdS05OhlKphKurq8Z2V1dXREdHi1SV7gUFBWHdunWoX78+Hj16hNmzZ6N9+/a4fPkybG1txS6vUsTHxwNAsde28DVD0aNHD7z++uvw8fHBrVu3MGXKFPTs2RORkZEwMTERuzytqVQqTJw4EW3btkWTJk0AFFxPc3Nz2Nvba+yrz9ezuPMEgEGDBqF27drw8PDAxYsXMWnSJMTExGDLli0iVqu9S5cuITg4GNnZ2bCxscHWrVvRqFEjnD9/3qCuZUnnCRjOtdywYQOioqJw+vTpIq9Vxr9Nhhsqk549e6q/btasGYKCglC7dm38/vvveP/990WsjHTh7bffVn/dtGlTNGvWDHXr1sXBgwfRpUsXESsrn7Fjx+Ly5csG0S+sNCWd58iRI9VfN23aFO7u7ujSpQtu3bqFunXrVnWZ5Va/fn2cP38eqamp2Lx5M4YOHYpDhw6JXZbOlXSejRo1Mohree/ePUyYMAF79+6FhYVFlXwmb0tVkJOTE0xMTIr06k5ISICbm5tIVVU+e3t71KtXDzdv3hS7lEpTeP2M7doCQJ06deDk5KSX13fcuHH4559/cODAAdSsWVO93c3NDbm5uUhJSdHYX1+vZ0nnWZygoCAA0LvraW5uDl9fX7Rs2RLh4eHw
9/fH119/bXDXsqTzLI4+XsuzZ88iMTERLVq0gKmpKUxNTXHo0CH873//g6mpKVxdXXV+PRluKsjc3BwtW7ZERESEeptKpUJERITGPVNDk56ejlu3bsHd3V3sUiqNj48P3NzcNK6tQqHAyZMnDfraAsD9+/fx+PFjvbq+giBg3Lhx2Lp1K/bv3w8fHx+N11u2bAkzMzON6xkTE4O4uDi9up4vOs/inD9/HgD06noWR6VSIScnx2CuZUkKz7M4+ngtu3TpgkuXLuH8+fPqR6tWrTB48GD11zq/nhXv/0wbNmwQZDKZsG7dOuHq1avCyJEjBXt7eyE+Pl7s0nTmk08+EQ4ePCjExsYKx44dE0JCQgQnJychMTFR7NIqJC0tTTh37pxw7tw5AYCwePFi4dy5c8Ldu3cFQRCE+fPnC/b29sKff/4pXLx4Uejbt6/g4+MjZGVliVy5dko7z7S0NOHTTz8VIiMjhdjYWGHfvn1CixYtBD8/PyE7O1vs0sts9OjRglwuFw4ePCg8evRI/cjMzFTvM2rUKKFWrVrC/v37hTNnzgjBwcFCcHCwiFVr70XnefPmTeHzzz8Xzpw5I8TGxgp//vmnUKdOHaFDhw4iV66dyZMnC4cOHRJiY2OFixcvCpMnTxYkEomwZ88eQRAM41oKQunnaSjXsjj/HQWm6+vJcKMjy5YtE2rVqiWYm5sLgYGBwokTJ8QuSacGDBgguLu7C+bm5oKnp6cwYMAA4ebNm2KXVWEHDhwQABR5DB06VBCEguHg06dPF1xdXQWZTCZ06dJFiImJEbfocijtPDMzM4Vu3boJzs7OgpmZmVC7dm1hxIgRehfOizs/AMIPP/yg3icrK0sYM2aM4ODgIFhZWQmvvfaa8OjRI/GKLocXnWdcXJzQoUMHwdHRUZDJZIKvr6/wf//3f0Jqaqq4hWvpvffeE2rXri2Ym5sLzs7OQpcuXdTBRhAM41oKQunnaSjXsjj/DTe6vp4SQRCE8rX5EBEREVU/7HNDREREBoXhhoiIiAwKww0REREZFIYbIiIiMigMN0RERGRQGG6IiIjIoDDcEBERkUFhuCEioyeRSLBt2zaxyyAiHWG4ISJRDRs2DBKJpMijR48eYpdGRHrKVOwCiIh69OiBH374QWObTCYTqRoi0ndsuSEi0clkMri5uWk8HBwcABTcMlq5ciV69uwJS0tL1KlTB5s3b9Y4/tKlS+jcuTMsLS1Ro0YNjBw5Eunp6Rr7fP/992jcuDFkMhnc3d0xbtw4jdeTk5Px2muvwcrKCn5+fvjrr78q96SJqNIw3BBRtTd9+nT0798fFy5cwODBg/H222/j2rVrAICMjAx0794dDg4OOH36NDZt2oR9+/ZphJeVK1di7NixGDlyJC5duoS//voLvr6+Gp8xe/ZsvPXWW7h48SJ69eqFwYMH48mTJ1V6nkSkIxVe2pOIqAKGDh0qmJiYCNbW1hqPuXPnCoJQsAr2qFGjNI4JCgoSRo8eLQiCIKxevVpwcHAQ0tPT1a9v375dkEql6pXNPTw8hKlTp5ZYAwBh2rRp6ufp6ekCAGHnzp06O08iqjrsc0NEonv55ZexcuVKjW2Ojo7qr4ODgzVeCw4Oxvnz5wEA165dg7+/P6ytrdWvt23bFiqVCjExMZBIJHj48CG6dOlSag3NmjVTf21tbQ07OzskJiaW95SISEQMN0QkOmtr6yK3iXTF0tKyTPuZmZlpPJdIJFCpVJVREhFVMva5IaJq78SJE0WeN2zYEADQsGFDXLhwARkZGerXjx07BqlUivr168PW1hbe3t6IiIio0pqJSDxsuSEi0eXk5CA+Pl5jm6mpKZycnAAAmzZtQqtWrdCuXTv8+uuvOHXqFL777jsAwODBgzFz5kwMHToUs2bNQlJSEsaPH493330Xrq6uAIBZs2Zh1KhRcHFxQc+ePZGWloZjx45h/PjxVXuiRFQlGG6ISHS7du2Cu7u7xrb69esjOjoaQMFI
pg0bNmDMmDFwd3fHb7/9hkaNGgEArKyssHv3bkyYMAGtW7eGlZUV+vfvj8WLF6vfa+jQocjOzsaSJUvw6aefwsnJCW+88UbVnSARVSmJIAiC2EUQEZVEIpFg69at6Nevn9ilEJGeYJ8bIiIiMigMN0RERGRQ2OeGiKo13jknIm2x5YaIiIgMCsMNERERGRSGGyIiIjIoDDdERERkUBhuiIiIyKAw3BAREZFBYbghIiIig8JwQ0RERAaF4YaIiIgMyv8DUIso5m8kRIgAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "train_her(env, agent, target_network, optimizer, td_loss_fn=td_loss_dqn)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Summary\n",
    "\n",
    "In this notebook we saw how to train a DQN agent with experience replay and target networks. We also saw the **Hindsight Experience Replay (HER)** variant, which augments the replay buffer with transitions relabeled against additional sub-goals, making learning more sample efficient in sparse-reward settings. While we combined HER with DQN here, it can also be combined with other off-policy learning algorithms, including some of the policy gradient methods covered in later chapters.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
